1 Viewing instruction set design as an optimization problem
Bruce K. Holmer, Alvin M. Despain 

September 1991 Proceedings of the 24th annual international symposium on
Microarchitecture

Full text available: pdf(1.04 MB) Additional Information: full citation, references, citings, index terms



2 Compiler optimizations for power, performance: Architectural analysis and
instruction-set optimization for design of network protocol processors
Haiyong Xie, Li Zhao, Laxmi Bhuyan 

October 2003 Proceedings of the 1st IEEE/ACM/IFIP international conference on 
Hardware/software codesign & system synthesis 

Full text available: pdf(87.43 KB) Additional Information: full citation, abstract, references, index terms

TCP/IP protocol processing latency has been an important issue in high-speed networks. In 
this paper, we present an architectural study of TCP/IP protocol. We port the TCP/IP 
protocol stack from the 4.4 FreeBSD to the SimpleScalar simulation environment. The 
architectural characteristics, such as instruction level parallelism and cache behavior, are 
studied through simulation. We also compare the characteristics of TCP/IP protocol to that of 
SPECint benchmark programs. It turns out that the form ... 

Keywords: TCP/IP protocol, architecture simulation, instruction optimization, network 
processing 



3 Instruction set mapping for performance optimization
M. Corazao, M. Khalaf, L. Guerra, M. Potkonjak, J. Rabaey 

November 1993 Proceedings of the 1993 IEEE/ACM international conference on 
Computer-aided design 

Full text available: pdf(397.31 KB) Additional Information: full citation, references, citings



4 LLVA: A Low-level Virtual Instruction Set Architecture 

Vikram Adve, Chris Lattner, Michael Brukman, Anand Shukla, Brian Gaeke 
December 2003 Proceedings of the 36th Annual IEEE/ACM International Symposium on 
Microarchitecture 
A virtual instruction set architecture (V-ISA) implemented via a processor-specific software
translation layer can provide great flexibility to processor designers. Recent examples such as
Crusoe and DAISY, however, have used existing hardware instruction sets as virtual
ISAs, which complicates translation and optimization. In fact, there has been little research
on specific designs for a virtual ISA for processors. This paper proposes a novel virtual ISA
(LLVA) and a translation strategy for implementi ...

5 An ASIP instruction set optimization algorithm with functional module sharing
constraint 

Alauddin Alomary, Takeharu Nakata, Yoshimichi Honma, Masaharu Imai, Nobuyuki Hikichi 
November 1993 Proceedings of the 1993 IEEE/ACM international conference on 
Computer-aided design 

Full text available: pdf(605.63 KB) Additional Information: full citation, references, citings



6 Instruction Set Design and Optimizations for Address Computation in DSP
Architectures

Guido Araujo, Ashok Sudarsanam, Sharad Malik 

November 1996 Proceedings of the 9th International Symposium on System Synthesis 

Full text available: pdf(886.84 KB), Publisher Site Additional Information: full citation, abstract, citings

In this paper we investigate the problem of code generation for address computation for 
DSP processors. This work is divided into four parts. First, we propose a branch instruction 
design which can guarantee minimum overhead for programs that make use of implicit 
indirect addressing. Second, we give a formulation and propose a solution for the problem of 
allocating address registers (ARs) for array accesses within loop constructs. Third, we 
describe retargetable approaches for auto-increment (de ... 
7 A tool for processor instruction set design
Bruce K. Holmer 

September 1994 Proceedings of the conference on European design automation 

Full text available: pdf(718.01 KB) Additional Information: full citation, references, citings, index terms



8 An approach to microprogram optimization considering resource occupancy and 
instruction formats 

Mario Tokoro, Eiji Tamura, Kazuhiko Takase, Kiichiro Tamaru 

October 1977 Proceedings of the 10th annual workshop on Microprogramming 

Full text available: pdf(1.09 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes a microprogram optimization technique considering resource occupancy 
and microinstruction format. This technique is applicable to machines whose microoperation 
occupies several machine cycles on a submachine cycle basis, and whose microinstruction 
format varies from horizontal to partially encoded, to vertical. "Microtemplate" is proposed 
to represent fetch timing and period of resource usage for a microoperation on a machine 
cycle and submachine cycle basis ... 
9 Retargetable tools for embedded software: Instruction set compiled simulation: a
technique for fast and flexible instruction set simulation
Mehrdad Reshadi, Prabhat Mishra, Nikil Dutt 

June 2003 Proceedings of the 40th conference on Design automation 

Full text available: pdf(198.91 KB) Additional Information: full citation, abstract, references, index terms

Instruction set simulators are critical tools for the exploration and validation of new 
programmable architectures. Due to increasing complexity of the architectures and time-to- 
market pressure, performance is the most important feature of an instruction-set simulator. 
Interpretive simulators are flexible but slow, whereas compiled simulators deliver speed at 
the cost of flexibility. This paper presents a novel technique for generation of fast instruction 
set simulators that combines the benefit ... 

Keywords: compiled simulation, instruction abstraction, instruction set architectures, 
interpretive simulation 



10 Synthesis for Low Power: Efficient instruction-level optimization methodology for low- 
power embedded systems 

Kyu-won Choi, Abhijit Chatterjee 

September 2001 Proceedings of the 14th international symposium on Systems 
synthesis 

Full text available: pdf(331.32 KB) Additional Information: full citation, abstract, references, index terms

In this paper, for low-power embedded systems, we solve the instruction scheduling and 
reordering problem as a Precedence Constrained Hamiltonian Path Problem for DAGs and 
the Traveling Salesman Problem (TSP), both of which are NP-Hard [1,2]. We propose an
efficient instruction-level optimization algorithm for solving the NP-Hard problem. Minimum 
spanning tree (MST) and simulated annealing (SA) mechanisms are used for the 
optimization. We describe the methods for generating the control flow and ... 

11 Co-synthesis of pipelined structures and instruction reordering constraints for 
instruction set processors 

Ing-Jer Huang 

January 2001 ACM Transactions on Design Automation of Electronic Systems
(TODAES), Volume 6 Issue 1

Full text available: pdf(1.58 MB) Additional Information: full citation, abstract, references, index terms

This paper presents a hardware/software co-synthesis approach to pipelined ISP (instruction 
set processor) design. The approach synthesizes the pipeline structure from a given 
instruction set architecture (behavioral) specification. In addition, it generates a set of 
reordering constraints that guides the compiler back-end (reorderer) to properly schedule 
instructions so that possible pipeline hazards are avoided and throughput is improved. Co- 
synthesis takes place while resolving ... 

Keywords: compiler instruction optimization, instruction set processor, pipeline hazards,
pipeline taxonomy, synthesis 



12 Instruction prefetching of systems codes with layout optimized for reduced cache 
misses 

Chun Xia, Josep Torrellas 

May 1996 ACM SIGARCH Computer Architecture News, Proceedings of the 23rd
annual international symposium on Computer architecture, Volume 24 Issue 2

Full text available: pdf(1.65 MB) Additional Information: full citation, abstract, references, citings, index terms

High-performing on-chip instruction caches are crucial to keep fast processors busy. 
Unfortunately, while on-chip caches are usually successful at intercepting instruction fetches
in loop-intensive engineering codes, they are less able to do so in large systems codes. To 
improve the performance of the latter codes, the compiler can be used to lay out the code in 
memory for reduced cache conflicts. Interestingly, such an operation leaves the code in a 
state that can be exploited by a new type of ... 

13 Energy efficient microarchitectural techniques: Energy-efficient instruction set
synthesis for application-specific processors
Jong-eun Lee, Kiyoung Choi, Nikil D. Dutt

August 2003 Proceedings of the 2003 international symposium on Low power 
electronics and design 

Full text available: pdf(78.03 KB) Additional Information: full citation, abstract, references, index terms

Several techniques have been proposed to enhance the energy-efficiency of ASIPs 
(Application-Specific Instruction set Processors). While those techniques can reduce the 
energy consumption with a minimal change in the instruction set (IS), they fail to exploit the 
opportunity of designing the entire IS from the energy-efficiency perspective. In this paper, 
we present an energy-efficient IS synthesis technique that can comprehensively reduce the 
energy-delay product (EDP) of ASIPs through optimal ... 

Keywords: application-specific instruction set processor (ASIP), customization, energy- 
delay product, instruction encoding, low power 



14 Architectural exploration and system simulations: An efficient retargetable framework
for instruction-set simulation 

Mehrdad Reshadi, Nikhil Bansal, Prabhat Mishra, Nikil Dutt 

October 2003 Proceedings of the 1st IEEE/ACM/IFIP international conference on 
Hardware/software codesign & system synthesis 

Full text available: pdf(251.51 KB) Additional Information: full citation, abstract, references, index terms

Instruction-set architecture (ISA) simulators are an integral part of today's processor and 
software design process. While increasing complexity of the architectures demands high 
performance simulation, the increasing variety of available architectures makes 
retargetability a critical feature of an instruction-set simulator. Retargetability requires 
generic models while high performance demands target specific customizations. To address 
these contradictory requirements, we have developed a gener ... 

Keywords: architecture description language, decode algorithm, generic instruction model, 
instruction binary encoding, retargetable instruction-set simulation 



15 Efficient instruction encoding for automatic instruction set design of configurable ASIPs
Jong-eun Lee, Kiyoung Choi, Nikil Dutt 

November 2002 Proceedings of the 2002 IEEE/ACM international conference on 
Computer-aided design 

Full text available: pdf(356.60 KB) Additional Information: full citation, abstract, references, citings, index terms

Application-specific instructions can significantly improve the performance, energy, and code 
size of configurable processors. A common approach used in the design of such instructions 
is to convert application-specific operation patterns into new complex instructions. However, 
processors with a fixed instruction bitwidth cannot accommodate all the potentially 
interesting operation patterns, due to the limited code space afforded by the fixed 
instruction bitwidth. We present a novel instruction ... 

16 Register connection: a new approach to adding registers into instruction set 
architectures

Tokuzo Kiyohara, Scott Mahlke, William Chen, Roger Bringmann, Richard Hank, Sadun Anik, 
Wen-Mei Hwu 

May 1993 ACM SIGARCH Computer Architecture News, Proceedings of the 20th
annual international symposium on Computer architecture, Volume 21 Issue 2

Full text available: pdf(1.07 MB) Additional Information: full citation, abstract, references, citings, index terms

Code optimization and scheduling for superscalar and superpipelined processors often 
increase the register requirement of programs. For existing instruction sets with a small to 
moderate number of registers, this increased register requirement can be a factor that limits 
the effectiveness of the compiler. In this paper, we introduce a new architectural method for
adding a set of extended registers into an architecture. Using a novel concept of connection, 
this method allows the data stored in ... 

17 Program optimization for instruction caches 
S. McFarling 

April 1989 ACM SIGARCH Computer Architecture News , Proceedings of the third 
international conference on Architectural support for programming 
languages and operating systems, Volume 17 issue 2 

Full text available: pdf(953.55 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents an optimization algorithm for reducing instruction cache misses. The 
algorithm uses profile information to reposition programs in memory so that a direct- 
mapped cache behaves much like an optimal cache with full associativity and full knowledge 
of the future. For best results, the cache should have a mechanism for excluding certain 
instructions designated by the compiler. This paper first presents a reduced form of the 
algorithm. This form is shown to produce an optimal ... 

18 An analysis of a Mesa instruction set using dynamic instruction frequencies
Gene McDaniel 

March 1982 Proceedings of the first international symposium on Architectural support 
for programming languages and operating systems 

Full text available: pdf(948.65 KB) Additional Information: full citation, abstract, references, citings, index terms

The Mesa architecture is implemented on a variety of processors, and dynamic instruction 
frequency data for two programs is used to analyze the architecture in an implementation 
independent fashion. The Mesa compiler allocates variables in an order based upon their 
static frequency of use, and measurements are provided that show that these static 
predictions predict run time usage as well. We provide an evaluation of the advantages and 
costs of Mesa's compact byte encoding, its r ... 

19 The design of an instruction set for common LISP 
Skef Wholey, Scott E. Fahlman 

August 1984 Proceedings of the 1984 ACM Symposium on LISP and functional 
programming 

Full text available: pdf(675.19 KB) Additional Information: full citation, abstract, references, citings, index terms

The design of a microcoded instruction set for executing Common Lisp is presented. The 
influence that the language design, the machine, and the operating system had on this 
design is described. A statistical analysis of object code for an earlier instruction set was 
used to assign specific instruction lengths that led to a significant compression of the object 
code. 
20 Storage assignment optimizations to generate compact and efficient code on
embedded DSPs 
Amit Rao, Santosh Pande 

May 1999 ACM SIGPLAN Notices , Proceedings of the ACM SIGPLAN 1999 conference 
on Programming language design and implementation, volume 34 issue 5 



DSP architectures typically provide dedicated memory address generation units and indirect 
addressing modes with auto-increment and auto-decrement that subsume address 
arithmetic calculation. The heavy use of auto-increment and auto-decrement indirect 
addressing requires DSP compilers to perform a careful placement of variables in storage to
minimize address arithmetic instructions to generate compact and efficient DSP code. Liao et 
al. formulated the problem of storage assignment as the simple o ... 